Statistical power to detect genetic and environmental influences in the presence of data missing at random.
Abstract
We study the situation in which an inexpensive measure (X) is observed in a large, representative twin sample, and a more expensive measure (Y) is observed only in a selected subsample. The aim of this study is to identify the selection design that gives the greatest statistical power to detect genetic and environmental influences on the variance of Y and on the covariance of X and Y. Data were simulated for 4000 dizygotic (DZ) and 2000 monozygotic (MZ) twins. Missingness in Y (87% or 97%) was then introduced in accordance with 7 selection designs: (i) concordant low + individual high design; (ii) extreme concordant design; (iii) extreme concordant and discordant design (EDAC); (iv) extreme discordant design; (v) individual score selection design; (vi) selection of an optimal number of MZ and DZ twins; and (vii) missing completely at random. We then investigated the statistical power to detect additive genetic, dominance genetic (D), and shared environmental effects on the variance of Y and on the covariance between X and Y. The power to detect additive genetic effects is high irrespective of the percentage of missingness or selection design. The power to detect shared environmental effects is acceptable when the percentage of missingness is 87%, but is low when the percentage of missingness is 97%, except for the individual score selection design, in which the power remains acceptable. The power to detect D is low, irrespective of selection design or percentage of missingness. The individual score selection design is therefore the best design for detecting genetic and environmental influences on the variance of Y and on the covariance of X and Y. However, the EDAC design may be preferred when an additional purpose of a study is to detect quantitative trait loci effects.
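The simulation setup described in the abstract can be illustrated with a short Python sketch. It is not the authors' code, and several choices are assumptions made only for illustration: the variance components (a = 0.5, c = 0.2, e = 0.3), the X-Y correlation (r_xy = 0.5), the omission of dominance effects, the reading of the sample sizes as numbers of pairs, and the interpretation of "individual score selection" as keeping Y for individuals whose X lies in either tail. The function names are hypothetical. The sketch only generates data and imposes missingness; a power analysis would additionally require fitting a twin model to each simulated data set.

```python
import math
import random


def ace_pair(r_genetic, rng, a=0.5, c=0.2, e=0.3):
    """Unit-variance scores for twin 1 and twin 2 under a simple ACE model.

    r_genetic is 1.0 for MZ and 0.5 for DZ pairs. The components a, c, e
    are hypothetical values and must sum to 1.
    """
    g1 = rng.gauss(0, 1)
    # Twin 2 shares a fraction r_genetic of twin 1's additive genetic factor.
    g2 = r_genetic * g1 + math.sqrt(1 - r_genetic ** 2) * rng.gauss(0, 1)
    shared = rng.gauss(0, 1)  # shared environment, identical within a pair
    return tuple(
        math.sqrt(a) * g + math.sqrt(c) * shared + math.sqrt(e) * rng.gauss(0, 1)
        for g in (g1, g2)
    )


def simulate_pairs(n_pairs, r_genetic, rng, r_xy=0.5):
    """Return a list of pairs, each pair being ((x1, y1), (x2, y2))."""
    pairs = []
    for _ in range(n_pairs):
        x = ace_pair(r_genetic, rng)
        resid = ace_pair(r_genetic, rng)
        # Y is a mix of X and an independent ACE-structured residual.
        y = tuple(r_xy * xi + math.sqrt(1 - r_xy ** 2) * ri
                  for xi, ri in zip(x, resid))
        pairs.append(((x[0], y[0]), (x[1], y[1])))
    return pairs


def individual_score_selection(pairs, missing):
    """Keep Y only for individuals with the most extreme |X| (assumed rule)."""
    abs_x = sorted(abs(x) for pair in pairs for x, _ in pair)
    cutoff = abs_x[min(int(missing * len(abs_x)), len(abs_x) - 1)]
    return [tuple((x, y if abs(x) >= cutoff else None) for x, y in pair)
            for pair in pairs]


def mcar_selection(pairs, missing, rng):
    """Delete Y completely at random with probability `missing`."""
    return [tuple((x, y if rng.random() >= missing else None) for x, y in pair)
            for pair in pairs]


def missing_rate(pairs):
    """Fraction of individuals whose Y is missing."""
    ys = [y for pair in pairs for _, y in pair]
    return sum(y is None for y in ys) / len(ys)


def twin_correlation(pairs):
    """Pearson correlation of X between twin 1 and twin 2."""
    x1 = [p[0][0] for p in pairs]
    x2 = [p[1][0] for p in pairs]
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    cov = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    var1 = sum((u - m1) ** 2 for u in x1)
    var2 = sum((v - m2) ** 2 for v in x2)
    return cov / math.sqrt(var1 * var2)


rng = random.Random(0)
mz = simulate_pairs(2000, 1.0, rng)  # assumed: 2000 MZ pairs
dz = simulate_pairs(4000, 0.5, rng)  # assumed: 4000 DZ pairs
for missing in (0.87, 0.97):
    selected = individual_score_selection(mz + dz, missing)
    print(missing, round(missing_rate(selected), 3))
```

Under these assumed parameters the expected twin correlation of X is a + c = 0.7 for MZ pairs and 0.5a + c = 0.45 for DZ pairs, which gives a quick sanity check on the simulated data before any selection is applied.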
Journal:
- Twin Research and Human Genetics: the official journal of the International Society for Twin Studies
Volume 10, Issue 1
Publication date: 2007